SIMD Parallel Sparse Matrix-Vector and Transposed-Matrix-Vector Multiplication in DD Precision

نویسندگان

  • Toshiaki Hishinuma
  • Hidehiko Hasegawa
  • Teruo Tanaka
چکیده

We accelerate a double precision sparse matrix and DD vector multiplication (DD-SpMV), and its transposition and DD vector multiplication (DD-TSpMV) by using SIMD AVX2 for Krylov subspace methods. We compare some storage formats of DD-SpMV and DDTSpMV for AVX2 to eliminate performance degradation factors in CRS. Our experience indicates that BCRS4x1, with fitting block size to the SIMD register’s length, is effective.

برای دانلود رایگان متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

AVX Acceleration of DD Arithmetic Between a Sparse Matrix and Vector

High precision arithmetic can improve the convergence of Krylov subspace methods; however, it is very costly. One system of high precision arithmetic is double-double (DD) arithmetic, which uses more than 20 double precision operations for one DD operation. We accelerated DD arithmetic using AVX SIMD instructions. The performances of vector operations in 4 threads are 51-59% of peak performance...

متن کامل

Breaking the performance bottleneck of sparse matrix-vector multiplication on SIMD processors

The low utilization of SIMD units and memory bandwidth is the main performance bottleneck on SIMD processors for sparse matrix-vector multiplication (SpMV), which is one of the most important kernels in many scientific and engineering applications. This paper proposes a hybrid optimization method to break the performance bottleneck of SpMV on SIMD processors. The method includes a new sparse ma...

متن کامل

A Simd Sparse Matrix-vector Multiplication Algorithm for Computational Electromagnetics and Scattering Matrix Models

Kipadia, Nirav Harish. M.S.E.E., Purdue University. May 1994. Pi SIMD Sparse Matrix-Vector Multiplication Algorithm for Computational Electromagnetics and Scattering Matrix Models. Major Professor: Jose Fortes. A large number of problems in numerical analysis require the multiplication of a sparse matrix by a vector. In spite of the large amount of fine-grained parallelism available in the proc...

متن کامل

Efficient multithreaded untransposed, transposed or symmetric sparse matrix-vector multiplication with the Recursive Sparse Blocks format

In earlier work we have introduced the “Recursive Sparse Blocks” (RSB) sparse matrix storage scheme oriented towards cache efficient matrix-vector multiplication (SpMV ) and triangular solution (SpSV ) on cache based shared memory parallel computers. Both the transposed (SpMV T ) and symmetric (SymSpMV ) matrix-vector multiply variants are supported. RSB stands for a meta-format: it recursively...

متن کامل

Direct and Transposed Sparse Matrix-Vector Multiplication

In this paper we investigate the execution of Ab and AT b, where A is a sparse matrix and b a dense vector, using the Blocked Based Compression Storage (BBCS) scheme and an Augmented Vector Architecture (AVA). In particular, we demonstrate that by using the BBCS format, we can represent both the direct and the transposed matrix for the purposes of matrix-vector multiplication with no additional...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2016